Scale-invariant structure of strongly conserved sequence in genomic intersections and alignments.
Abstract
A power-law distribution of the length of perfectly conserved sequence from mouse/human whole-genome intersection and alignment is exhibited. Spatial correlations of these elements within the mouse genome are studied. It is argued that these power-law distributions and correlations consist in part of functional noncoding sequence and ought to be accounted for in estimating the statistical significance of apparent sequence conservation. These inter-genomic correlations of conservation are placed in the context of previously observed intra-genomic correlations, and their possible origins and consequences are discussed.
Related articles
Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species
Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome, which is an important wild bio-source for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic implications in plants. Different ch...
Efficient large-scale sequence comparison by locality-sensitive hashing
MOTIVATION: Comparison of multimegabase genomic DNA sequences is a popular technique for finding and annotating conserved genome features. Performing such comparisons entails finding many short local alignments between sequences up to tens of megabases in length. To process such long sequences efficiently, existing algorithms find alignments by expanding around short runs of matching bases with ...
Comparison of the Lipophosphoglycan 3 Gene of the Lizard and Mammalian Leishmania: A Homology Modeling
Background: Lipophosphoglycan 3 (LPG3) is required for the assembly of LPG, a well-known virulent molecule. In this study, the LPG3 genes of lizard and mammalian Leishmania species were cloned and sequenced. A three-dimensional (3D) structure for the target sequence was also predicted by comparative (homology) modeling. Materials and Methods: An optimization PCR amplification was performed o...
ddbRNA: detection of conserved secondary structures in multiple alignments
MOTIVATION: Structured non-coding RNAs (ncRNAs) have a very important functional role in the cell. No distinctive general features common to all ncRNAs have yet been discovered. This makes it difficult to design computational tools able to detect novel ncRNAs in the genomic sequence. RESULTS: We devised an algorithm able to detect conserved secondary structures in both pairwise and multiple DNA ...
Multiple sequence alignment: in pursuit of homologous DNA positions.
DNA sequence alignment is a prerequisite to virtually all comparative genomic analyses, including the identification of conserved sequence motifs, estimation of evolutionary divergence between sequences, and inference of historical relationships among genes and species. While it is mere common sense that inaccuracies in multiple sequence alignments can have detrimental effects on downstream ana...
Journal:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 103, Issue 35
Pages: -
Publication date: 2006